HMC at SemEval-2016 Task 11: Identifying Complex Words Using Depth-limited Decision Trees

Authors

  • Maury Quijada
  • Julie Medero
Abstract

We present two systems created for SemEval-2016's Task 11: Complex Word Identification. Our two systems, a regression tree and a decision tree, were trained with a word's unigram and lemma word counts, average age-of-acquisition, and a measure of concreteness. The systems ranked 5th and 6th, respectively, on the test set by G-score (the harmonic mean of accuracy and recall). With the regression tree's predictions earning a G-score of 0.766 and the decision tree's earning 0.765, the two systems scored within 1 percent of the best-performing system in the task.
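The G-score mentioned in the abstract can be illustrated with a minimal sketch. It assumes binary labels where 1 marks a complex word and 0 a simple one. The function names `accuracy_and_recall` and `g_score` and the toy labels are hypothetical and are not taken from the paper or from the task's official scorer.

```python
from typing import Sequence, Tuple


def accuracy_and_recall(gold: Sequence[int], predicted: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, recall) for binary labels, where 1 = complex word."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one label is required")

    # Accuracy: share of all words labeled correctly.
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    accuracy = correct / len(gold)

    # Recall: share of truly complex words that were predicted complex.
    positives = sum(1 for g in gold if g == 1)
    true_positives = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    # Assumption: recall is reported as 0.0 when there are no complex words.
    recall = true_positives / positives if positives else 0.0

    return accuracy, recall


def g_score(accuracy: float, recall: float) -> float:
    """Harmonic mean of accuracy and recall."""
    if accuracy + recall == 0:
        return 0.0
    return 2 * accuracy * recall / (accuracy + recall)


# Toy labels for illustration only (not data from the task).
gold = [1, 0, 0, 1, 0, 0, 0, 1]
predicted = [1, 0, 1, 1, 0, 0, 1, 0]
acc, rec = accuracy_and_recall(gold, predicted)
score = g_score(acc, rec)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a system cannot reach a high G-score by labeling every word as simple (high accuracy on an imbalanced set, zero recall).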

Similar resources

JU_NLP at SemEval-2016 Task 11: Identifying Complex Words in a Sentence

The complex word identification task refers to the process of identifying difficult words in a sentence from the perspective of readers belonging to a specific target audience. This task has immense importance in the field of lexical simplification. Lexical simplification helps in improving the readability of texts consisting of challenging words. As a participant of the SemEval-2016: Task 11 s...

MAZA at SemEval-2016 Task 11: Detecting Lexical Complexity Using a Decision Stump Meta-Classifier

This paper describes team MAZA entries for the 2016 SemEval Task 11: Complex Word Identification (CWI). The task is a binary classification task in which systems are trained to predict whether a word in a sentence is considered to be complex or not. We developed our two systems for this task based on classifier stacking using decision stumps and decision trees. Our best system, using contextual...

SemEval 2016 Task 11: Complex Word Identification

We report the findings of the Complex Word Identification task of SemEval 2016. To create a dataset, we conduct a user study with 400 non-native English speakers, and find that complex words tend to be rarer, less ambiguous and shorter. A total of 42 systems were submitted from 21 distinct teams, and nine baselines were provided. The results highlight the effectiveness of Decision Trees and Ens...

Sensible at SemEval-2016 Task 11: Neural Nonsense Mangled in Ensemble Mess

This paper describes our submission to the Complex Word Identification (CWI) task in SemEval-2016. We test an experimental approach to blindly use neural nets to solve the CWI task that we know little/nothing about. By structuring the input as a series of sequences and the output as a binary that indicates 1 to denote complex words and 0 otherwise, we introduce a novel approach to complex word ...

LTG at SemEval-2016 Task 11: Complex Word Identification with Classifier Ensembles

We present the description of the LTG entry in the SemEval-2016 Complex Word Identification (CWI) task, which aimed to develop systems for identifying complex words in English sentences. Our entry focused on the use of contextual language model features and the application of ensemble classification methods. Both of our systems achieved good performance, ranking in 2nd and 3rd place overall in ...


Publication date: 2016